{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Dirichlet Distribution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-30T18:32:41.753600Z",
     "start_time": "2018-11-30T18:32:41.741408Z"
    },
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import warnings\n",
    "from collections import OrderedDict\n",
    "from pathlib import Path\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Visualization\n",
    "from ipywidgets import interact, FloatSlider\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-30T18:32:57.047779Z",
     "start_time": "2018-11-30T18:32:57.038865Z"
    }
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "plt.style.use('ggplot')\n",
    "plt.rcParams['figure.figsize'] = (14.0, 8.7)\n",
    "warnings.filterwarnings('ignore')\n",
    "pd.options.display.float_format = '{:,.2f}'.format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Simulate Dirichlet Distribution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Dirichlet distribution produces probability vectors that can be used with discrete distributions. That is, it randomly generates a given number of values that are positive and sum to one. It has a parameter 𝜶  of positive real value that controls the concentration of the probabilities. Values closer to zero mean that only a few values will be positive and receive most probability mass. \n",
    "\n",
    "The following simulation let's you interactively explore how different parameter values affect the resulting probability distributions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-30T18:32:58.986986Z",
     "start_time": "2018-11-30T18:32:58.197387Z"
    },
    "hide_input": false,
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1f076c57d1ff4decaae28960325cc703",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "interactive(children=(FloatSlider(value=1.0, continuous_update=False, description='Alpha', min=0.01, step=0.01…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "f=FloatSlider(value=1, min=1e-2, max=1e2, step=1e-2, continuous_update=False, description='Alpha')\n",
    "@interact(alpha=f)\n",
    "def sample_dirichlet(alpha):\n",
    "    topics = 10\n",
    "    draws= 9\n",
    "    alphas = np.full(shape=topics, fill_value=alpha)\n",
    "    samples = np.random.dirichlet(alpha=alphas, size=draws)\n",
    "    \n",
    "    fig, axes = plt.subplots(nrows=3, ncols=3, sharex=True, sharey=True)\n",
    "    axes = axes.flatten()\n",
    "    plt.setp(axes, ylim=(0, 1))\n",
    "    for i, sample in enumerate(samples):\n",
    "        axes[i].bar(x=list(range(10)), height=sample, color=sns.color_palette(\"Set2\", 10))\n",
    "    fig.suptitle('Dirichlet Allocation | 10 Topics, 9 Samples')\n",
    "    fig.tight_layout()\n",
    "    plt.subplots_adjust(top=.95)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "name": "_merged",
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "100px",
    "left": "85.9915px",
    "right": "1064px",
    "top": "66.3352px",
    "width": "241.562px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
